77 research outputs found
Bayesian Statistical Methods for Genetic Association Studies with Case-Control and Cohort Design
Large-scale genetic association studies are carried out with the hope of discovering single nucleotide polymorphisms involved in the etiology of complex diseases. We propose a coalescent-based model for association mapping which potentially increases the power to detect disease-susceptibility variants in genetic association studies with case-control and cohort design. The approach uses Bayesian partition modelling to cluster haplotypes with similar disease risks by exploiting evolutionary information. We focus on candidate gene regions and split the chromosomal region of interest into sub-regions, or windows, of high linkage disequilibrium (LD), within each of which we assume a perfect phylogeny. The haplotype space is then partitioned into disjoint clusters within which the phenotype-haplotype association is assumed to be the same. The novelty of our approach is that the distance used for clustering haplotypes has an evolutionary interpretation, as haplotypes are clustered according to the time to their most recent common mutation. Our approach is fully Bayesian and we develop Markov Chain Monte Carlo algorithms to sample efficiently over the space of possible partitions. We have also developed a Bayesian survival regression model for high-dimensional, small-sample settings. We provide a Bayesian variable selection procedure and shrinkage tool by imposing shrinkage priors on the regression coefficients. We have developed a computationally efficient optimization algorithm to explore the posterior surface and find the maximum a posteriori estimates of the regression coefficients. In simulation studies and on real datasets, we compare the proposed methods with both single-marker analyses and recently proposed multi-marker methods and show that our methods perform similarly in localizing the causal allele while yielding lower false positive rates. Moreover, our methods offer computational advantages over other multi-marker approaches.
Bayesian survival analysis in genetic association studies
Motivation: Large-scale genetic association studies are carried out with the hope of discovering single nucleotide polymorphisms involved in the etiology of complex diseases. There are several existing methods in the literature for performing this kind of analysis for case-control studies, but less work has been done for prospective cohort studies. We present a Bayesian method for linking markers to censored survival outcomes by clustering haplotypes using gene trees. Coalescent-based approaches are promising for LD mapping, as the coalescent offers a good approximation to the evolutionary history of mutations.
Using ancestry-informative markers to identify fine structure across 15 populations of European origin
The Wellcome Trust Case Control Consortium 3 anorexia nervosa genome-wide association scan includes 2907 cases from 15 different populations of European origin genotyped on the Illumina 670K chip. We compared methods for identifying population stratification and suggest a list of markers that may help to counter this problem. It is usual to identify population structure in such studies using only common variants with minor allele frequency (MAF) >5%; we find that this may result in highly informative SNPs being discarded, and suggest that all SNPs with MAF >1% may be used instead. We established informative axes of variation identified via principal component analysis and highlight important features of the genetic structure of diverse European-descent populations, some studied for the first time at this scale. Finally, we investigated the substructure within each of these 15 populations and identified SNPs that help capture hidden stratification. This work can inform the design of association studies and the interpretation of their results in the International Consortia.
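The effect of the MAF cut-off described above can be illustrated with a minimal sketch. It assumes genotypes coded as 0, 1 or 2 copies of one allele; the SNP identifiers and toy data are hypothetical and are not taken from the study.

```python
def minor_allele_frequency(genotypes):
    """MAF for one SNP, given genotypes coded as 0, 1 or 2 copies of an allele."""
    if not genotypes:
        raise ValueError("no genotypes supplied")
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1.0 - freq)


def filter_by_maf(snps, threshold):
    """Return the ids of SNPs whose MAF is strictly above the threshold."""
    return [snp_id for snp_id, g in snps.items()
            if minor_allele_frequency(g) > threshold]


# Hypothetical toy data: 100 individuals per SNP.
snps = {
    "snp_common": [1] * 40 + [0] * 60,   # allele frequency 40/200 = 0.20
    "snp_lowfreq": [1] * 6 + [0] * 94,   # allele frequency 6/200 = 0.03
    "snp_rare": [1] * 1 + [0] * 99,      # allele frequency 1/200 = 0.005
}

kept_at_5_percent = filter_by_maf(snps, 0.05)
kept_at_1_percent = filter_by_maf(snps, 0.01)
```

In this toy example the 5% cut-off keeps only `snp_common`, while the 1% cut-off also retains `snp_lowfreq`, which is the kind of SNP the abstract suggests may be informative for population structure.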
Rare variant contribution to human disease in 281,104 UK Biobank exomes.
Genome-wide association studies have uncovered thousands of common variants associated with human disease, but the contribution of rare variants to common disease remains relatively unexplored. The UK Biobank contains detailed phenotypic data linked to medical records for approximately 500,000 participants, offering an unprecedented opportunity to evaluate the effect of rare variation on a broad collection of traits. Here we study the relationships between rare protein-coding variants and 17,361 binary and 1,419 quantitative phenotypes using exome sequencing data from 269,171 UK Biobank participants of European ancestry. Gene-based collapsing analyses revealed 1,703 statistically significant gene-phenotype associations for binary traits, with a median odds ratio of 12.4. Furthermore, 83% of these associations were undetectable via single-variant association tests, emphasizing the power of gene-based collapsing analysis in the setting of high allelic heterogeneity. Gene-phenotype associations were also significantly enriched for loss-of-function-mediated traits and approved drug targets. Finally, we performed ancestry-specific and pan-ancestry collapsing analyses using exome sequencing data from 11,933 UK Biobank participants of African, East Asian or South Asian ancestry. Our results highlight a significant contribution of rare variants to common disease. Summary statistics are publicly available through an interactive portal (http://azphewas.com/).
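As a rough illustration of what a gene-based collapsing analysis does, the sketch below collapses each person's qualifying variants in one gene into a single carrier flag and then compares carrier counts between cases and controls. The use of a two-sided Fisher's exact test, the variant ids and the toy cohort are assumptions made for illustration; the abstract does not specify these details.

```python
from math import comb


def collapse_carriers(qualifying_variants_by_person):
    """One flag per person: True if they carry at least one qualifying variant in the gene."""
    return [len(variants) > 0 for variants in qualifying_variants_by_person]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x carriers among the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical toy cohort: every case carries a different rare variant.
case_flags = collapse_carriers([{"v1"}, {"v2"}, {"v3"}])
control_flags = collapse_carriers([set(), set(), set()])

a = sum(case_flags)              # case carriers
b = len(case_flags) - a          # case non-carriers
c = sum(control_flags)           # control carriers
d = len(control_flags) - c       # control non-carriers
p_value = fisher_exact_two_sided(a, b, c, d)
```

Each toy case carries a different variant, so no single variant is shared between cases, yet the collapsed carrier counts still differ between the groups. This is the allelic-heterogeneity setting the abstract refers to.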
Genetic architecture distinguishes systemic juvenile idiopathic arthritis from other forms of juvenile idiopathic arthritis: Clinical and therapeutic implications
Objectives: Juvenile idiopathic arthritis (JIA) is a heterogeneous group of conditions unified by the presence of chronic childhood arthritis without an identifiable cause. Systemic JIA (sJIA) is a rare form of JIA characterised by systemic inflammation. sJIA is distinguished from other forms of JIA by unique clinical features and treatment responses that are similar to autoinflammatory diseases. However, approximately half of children with sJIA develop destructive, long-standing arthritis that appears similar to other forms of JIA. Using genomic approaches, we sought to gain novel insights into the pathophysiology of sJIA and its relationship with other forms of JIA. Methods: We performed a genome-wide association study of 770 children with sJIA collected in nine countries by the International Childhood Arthritis Genetics Consortium. Single nucleotide polymorphisms were tested for association with sJIA. Weighted genetic risk scores were used to compare the genetic architecture of sJIA with other JIA subtypes. Results: The major histocompatibility complex locus and a locus on chromosome 1 each showed association with sJIA exceeding the threshold for genome-wide significance, while 23 other novel loci were suggestive of association with sJIA. Using a combination of genetic and statistical approaches, we found no evidence of shared genetic architecture between sJIA and other common JIA subtypes. Conclusions: The lack of shared genetic risk factors between sJIA and other JIA subtypes supports the hypothesis that sJIA is a unique disease process and argues for a different classification framework. Research to improve sJIA therapy should target its unique genetics and specific pathophysiological pathways.
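A weighted genetic risk score is commonly computed as a weighted sum of risk-allele dosages. The sketch below assumes per-SNP weights such as log odds ratios from an earlier study; the SNP names and weights are hypothetical, and the abstract does not state how the scores in this study were constructed.

```python
def weighted_genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2) over the SNPs that have a weight."""
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"no dosage for SNPs: {sorted(missing)}")
    return sum(weights[snp] * dosages[snp] for snp in weights)


# Hypothetical per-SNP weights (e.g. log odds ratios) and one person's dosages.
weights = {"snp_a": 0.5, "snp_b": 0.25}
dosages = {"snp_a": 2, "snp_b": 1}

score = weighted_genetic_risk_score(dosages, weights)
```

Comparing the distribution of such scores between groups, for example sJIA cases and controls, is one way to ask whether risk variants known from one subtype also carry risk in another.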
The African Genome Variation Project shapes medical genetics in Africa.
Given the importance of Africa to studies of human origins and disease susceptibility, detailed characterization of African genetic diversity is needed. The African Genome Variation Project provides a resource with which to design, implement and interpret genomic studies in sub-Saharan Africa and worldwide. The African Genome Variation Project represents dense genotypes from 1,481 individuals and whole-genome sequences from 320 individuals across sub-Saharan Africa. Using this resource, we find novel evidence of complex, regionally distinct hunter-gatherer and Eurasian admixture across sub-Saharan Africa. We identify new loci under selection, including loci related to malaria susceptibility and hypertension. We show that modern imputation panels (sets of reference genotypes from which unobserved or missing genotypes in study sets can be inferred) can identify association signals at highly differentiated loci across populations in sub-Saharan Africa. Using whole-genome sequencing, we demonstrate further improvements in imputation accuracy, strengthening the case for large-scale sequencing efforts of diverse African haplotypes. Finally, we present an efficient genotype array design capturing common genetic variation in Africa.
Rare Variant Analysis of Human and Rodent Obesity Genes in Individuals with Severe Childhood Obesity
A. Palotie is a member of the UK10K Consortium working group. Obesity is a genetically heterogeneous disorder. Using targeted and whole-exome sequencing, we studied 32 human and 87 rodent obesity genes in 2,548 severely obese children and 1,117 controls. We identified 52 variants contributing to obesity in 2% of cases including multiple novel variants in GNAS, which were sometimes found with accelerated growth rather than short stature as described previously. Nominally significant associations were found for rare functional variants in BBS1, BBS9, GNAS, MKKS, CLOCK and ANGPTL6. The p.S284X variant in ANGPTL6 drives the association signal (rs201622589, MAF ~0.1%, odds ratio = 10.13, p-value = 0.042) and results in complete loss of secretion in cells. Further analysis including additional case-control studies and population controls (N = 260,642) did not support association of this variant with obesity (odds ratio = 2.34, p-value = 2.59 × 10^-3), highlighting the challenges of testing rare variant associations and the need for very large sample sizes. Further validation in cohorts with severe obesity and engineering the variants in model organisms will be needed to explore whether human variants in ANGPTL6 and other genes that lead to obesity when deleted in mice do contribute to obesity. Such studies may yield druggable targets for weight loss therapies. Peer reviewed.
Uganda Genome Resource Enables Insights into Population History and Genomic Discovery in Africa.
Genomic studies in African populations provide unique opportunities to understand disease etiology, human diversity, and population history. In the largest study of its kind, comprising genome-wide data from 6,400 individuals and whole-genome sequences from 1,978 individuals from rural Uganda, we find evidence of geographically correlated fine-scale population substructure. Historically, the ancestry of modern Ugandans was best represented by a mixture of ancient East African pastoralists. We demonstrate the value of the largest sequence panel from Africa to date as an imputation resource. Examining 34 cardiometabolic traits, we show systematic differences in trait heritability between European and African populations, probably reflecting the differential impact of genes and environment. In a multi-trait pan-African GWAS of up to 14,126 individuals, we identify novel loci associated with anthropometric, hematological, lipid, and glycemic traits. We find that several functionally important signals are driven by Africa-specific variants, highlighting the value of studying diverse populations across the region.
Main funding: This work was funded by the Wellcome Trust, The Wellcome Sanger Institute (WT098051), the U.K. Medical Research Council (G0901213-92157, G0801566, and MR/K013491/1), and the Medical Research Council/Uganda Virus Research Institute Uganda Research Unit on AIDS core funding.
A genome-wide association study of anorexia nervosa
Anorexia nervosa (AN) is a complex and heritable eating disorder characterized by dangerously low body weight. Neither candidate gene studies nor an initial genome-wide association study (GWAS) have yielded significant and replicated results. We performed a GWAS in 2,907 cases with AN from 14 countries (15 sites) and 14,860 ancestrally matched controls as part of the Genetic Consortium for AN (GCAN) and the Wellcome Trust Case Control Consortium 3 (WTCCC3). Individual association analyses were conducted in each stratum and meta-analyzed across all 15 discovery datasets. Seventy-six (72 independent) SNPs were taken forward for in silico (two datasets) or de novo (13 datasets) replication genotyping in 2,677 independent AN cases and 8,629 European ancestry controls along with 458 AN cases and 421 controls from Japan. The final global meta-analysis across discovery and replication datasets comprised 5,551 AN cases and 21,080 controls. AN subtype analyses (1,606 AN restricting; 1,445 AN binge-purge) were performed. No findings reached genome-wide significance. Two intronic variants were suggestively associated: rs9839776 (P = 3.01 × 10^-7) in SOX2OT and rs17030795 (P = 5.84 × 10^-6) in PPP3CA. Two additional signals were specific to Europeans: rs1523921 (P = 5.76 × 10^-6) between CUL3 and FAM124B and rs1886797 (P = 8.05 × 10^-6) near SPATA13. Comparing discovery to replication results, 76% of the effects were in the same direction, an observation highly unlikely to be due to chance (P = 4 × 10^-6), strongly suggesting that true findings exist but that our sample, the largest yet reported, was underpowered for their detection. The accrual of large genotyped AN case-control samples should be an immediate priority for the field.
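Direction-of-effect consistency between discovery and replication is often assessed with a binomial sign test: under the null hypothesis each effect has a 50% chance of pointing the same way. The sketch below shows one such test with hypothetical counts. The abstract does not state which test or exact counts were used, so this is not a reproduction of the reported P value.

```python
from math import comb


def sign_test_one_sided(n_same_direction, n_total):
    """P(X >= n_same_direction) for X ~ Binomial(n_total, 0.5)."""
    if not 0 <= n_same_direction <= n_total:
        raise ValueError("n_same_direction must lie between 0 and n_total")
    # Count the outcomes with at least n_same_direction concordant effects.
    tail = sum(comb(n_total, k) for k in range(n_same_direction, n_total + 1))
    return tail / 2 ** n_total


# Hypothetical example: 8 of 10 SNP effects agree in direction.
p_toy = sign_test_one_sided(8, 10)
```

With many independent SNPs, even a moderate excess of concordant directions gives a very small tail probability, which is the reasoning behind the abstract's claim that true signals are likely present.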